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Abstract. This pedagogical review addresses several issues related to statistical de- 
scription of gravitating systems in both static and expanding backgrounds, focusing on 
the latter. After briefly reviewing the results for the static background, I describe the 
key issues and recent progress in the context of gravitational clustering of collision-less 
particles in an expanding universe. The questions addressed include: (a) How does the 
power injected into the system at a given wave number spread to smaller and larger 
On : scales? (b) How does the power spectrum of density fluctuations behave asymptotically 

at late times? (c) What are the universal characteristics of gravitational clustering that 
are independent of the initial conditions and manifest at the late time evolution of the 
system? The review is intended for non cosmologists and will be of interest to people 
working in fluid mechanics, non linear dynamics and condensed matter physics. 



1 Introduction 



CO 

o 

CN 

. The statistical mechanics of systems dominated by gravity is of interest both 

from the theoretical and "practical" perspectives. Theoretically this field has 
close connections with areas of condensed matter physics, fluid mechanics, re- 
normalization group etc. and poses an incredible challenge as regards the basic 
formulation. From the practical point of view, the ideas find application in many 
different areas of astrophysics and cosmology, especially in the study of globular 
clusters, galaxies and gravitational clustering in the expanding universe. (For a 
general review of statistical mechanics of gravitating systems, see [Q; textbook 
description of the subject is available in [2|, ||. Review of gravitational clustering 
' in expanding background is available in 1 4 } and in several textbooks in cosmology 

[pi . There have been many attempts to understand these phenomena by different 
groups; see ||, 0, ||, Jl(|, jll) and the references cited therein.) Given the 
diversity of the subject, it will be useful to begin with a broad overview and a 
description of the issues which will be addressed in this review. 

I will concentrate mostly on the problem of gravitational clustering in the 
expanding universe which is one of the most active research areas in cosmology. 
However, to place this problem in context, it is necessary to discuss statistical 
mechanics of isolated gravitating systems (without any cosmological expansion) 
in some detail. In Part I of this review, I cover this aspect highlighting the 
important features but referring the reader to existing previous literature for 
details. Part II presents a more detailed description of gravitational clustering 
in the context of cosmology. The rest of the introduction will be devoted to a 
summary of different issues which will be expanded upon in the later sections. 
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Let me begin with the issues which arise in the study of isolated gravitating 
systems (like, say, a cluster of stars) treated as a collection of structure-less 
point particles. In Newtonian theory, the gravitational force can be described as 
a gradient of a scalar potential and the evolution of a set of particles under the 
action of gravitational forces can be described the equations 

Xi = -V0(Xi,t); V 2 (/. = 47rG^m l <5 D (x-x l ) (1) 

i 

where Xj is the position of the i— th particle, rrij is its mass. For sufficiently large 
number of particles, it is useful to investigate whether some kind of statistical 
description of such a system is possible. Such a description, however, is com- 
plicated by the long range, unscreened, nature of gravitational force. The force 
acting on any given particle receives contribution even from particles which are 
far away. If a self gravitating system is divided into two parts, the total energy 
of the system cannot be expressed as the sum of the gravitational energy of the 
components. Many of the conventional results in statistical physics are based on 
the extensivity of the energy which is clearly not valid for gravitating systems. 
To make progress, we have to use different techniques which are appropriate for 
each situation. 

To construct the statistical description of such a system, one should begin 
with the construction of the micro-canonical ensemble describing such a sys- 
tem. If the Hamiltonian of the system is H(pi,qi) then the volume g{E) of the 
constant energy surface H(pi,qi) = E will be of primary importance in the 
micro-canonical ensemble. The logarithm of this function will give the entropy 
S(E) = ]ng(E) and the temperature of the system will be T(E) = /3(E)- 1 = 
{dS/dE)~\ 

Systems for which a description based on canonical ensemble is possible, the 
Laplace transform of g(E) with respect to a variable (3 will give the partition 
function Z((3). It is, however, trivial to show that gravitating systems of interest 
in astrophysics cannot be described by a canonical ensemble M , j^] , [|l3| . Virial 
theorem holds for such systems and we have (2K+U) — where K and U are the 
total kinetic and potential energies of the system. This leads to E = K + U = 
—K; since the temperature of the system is proportional to the total kinetic 
energy, the specific heat will be negative: Cy = (8E/dT)y cx (BE /OK) < 0. On 
the other hand, the specific heat of any system described by a canonical ensemble 
Cy — (3 2 < (AE) 2 > will be positive definite. Thus one cannot describe self 
gravitating systems of the kind we are interested in by canonical ensemble. 

One may attempt to find the equilibrium configuration for self gravitating 
systems by maximizing the entropy S(E) or the phase volume g{E). It is again 
easy to show that no global maximum for the entropy exists for classical point 
particles interacting via Newtonian gravity. To prove this, we only need to con- 
struct a configuration with arbitrarily high entropy which can be achieved as 
follows: Consider a system of N particles initially occupying a region of finite 
volume in phase space and total energy E. We now move a small number of 
these particles (in fact, a pair of them, say, particles 1 and 2 will do) arbitrarily 
close to each other. The potential energy of interaction of these two particles, 
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—Gm\m,ijr\2^ will become arbitrarily high as r\2 — > 0. Transferring some of this 
energy to the rest of the particles, we can increase their kinetic energy without 
limit. This will clearly increase the phase volume occupied by the system with- 
out bound. This argument can be made more formal by dividing the original 
system into a small, compact core and a large diffuse halo and allowing the core 
to collapse and transfer the energy to the halo. 

The absence of the global maximum for entropy — as argued above — de- 
pends on the idealization that there is no short distance cut-off in the inter- 
action of the particles, so that we could take the limit r±2 — ► 0. If we assume, 
instead, that each particle has a minimum radius a, then the typical lower bound 
to the gravitational potential energy contributed by a pair of particles will be 
—Gmim2/2a. This will put an upper bound on the amount of energy that can 
be made available to the rest of the system. 

We have also assumed that part of the system can expand without limit — in 
the sense that any particle with sufficiently large energy can move to arbitrarily 
large distances. In real life, no system is completely isolated and eventually one 
has to assume that the meandering particle is better treated as a member of 
another system. One way of obtaining a truly isolated system is to confine the 
system inside a spherical region of radius R with, say, reflecting wall. 

The two cut-offs a and R will make the upper bound on the entropy finite, but 
even with the two cut-offs, the primary nature of gravitational instability cannot 
be avoided. The basic phenomenon described above (namely, the formation of 
a compact core and a diffuse halo) will still occur since this is the direction 
of increasing entropy. Particles in the hot diffuse component will permeate the 
entire spherical cavity, bouncing off the walls and having a kinetic energy which 
is significantly larger than the potential energy. The compact core will exist as a 
gravitationally bound system with very little kinetic energy. A more formal way 
of understanding this phenomena is based on the virial theorem for a system 
with a short distance cut-off confined to a sphere of volume V. In this case, the 
virial theorem will read as 

2T+U = 3PV + $ (2) 

where P is the pressure on the walls and $ is the correction to the potential 
energy arising from the short distance cut-off. This equation can be satisfied in 
essentially three different ways. If T and U are significantly higher than 3PV, 
then we have 2T + U « which describes a self gravitating systems in standard 
virial equilibrium but not in the state of maximum entropy. If T U and 
3PV ^> <P, one can have 2T w 3PV which describes an ideal gas with no 
potential energy confined to a container of volume V; this will describe the hot 
diffuse component at late times. If T <C U and 3PV <C ^, then one can have 
U ~ <P describing the compact potential energy dominated core at late times. 
In general, the evolution of the system will lead to the production of the core 
and the halo and each component will satisfy the virial theorem in the form (^) . 
Such an asymptotic state with two distinct phases flli) is quite different from 
what would have been expected for systems with only short range interaction. 
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Considering its importance, I shall briefly describe in section^ a toy model which 
captures the essential physics of the above system in an exactly solvable context. 

The above discussion focussed on the existence of global maximum to the 
entropy and we proved that it does not exist in the absence of two cut-offs. 
It is, however, possible to have local extrema of entropy which are not global 
maxima. Intuitively, one would have expected the distribution of matter in the 
configuration which is a local extrema of entropy to be described by a Boltz- 
mann distribution, with the density given by p(x) cx exp[— /34>(x)] where <fi is 
the gravitational potential related to p by Poisson equation. This is indeed true 
and a formal proof will be given in section |^. This configuration is usually called 
the isothermal sphere (because it can be shown that, among all solutions to this 
equation, the one with spherical symmetry maximizes the entropy) and since it 
is a local maximum of entropy, it deserves careful study. I will describe briefly 
some of the interesting features of the isothermal spheres in section || and this 
configuration will play a dominant role throughout our review. The second (func- 
tional) derivative of the entropy with respect to the configuration variables will 
determine whether the local extremum of entropy is a local maximum or a saddle 
point jl5| , jl6| and some of these results are described at the end of section 0. 
The relevance of the long range of gravity in all the above phenomena can be un- 
derstood by studying model systems with an attractive potential varying as r~ a 
with different values for a. Such studies confirm the results and interpretation 
given above; (see [JL7| and references cited therein). 

Let us now consider the situation in the context of an expanding background 
described in Part II which is the main theme of the review. There is considerable 
amount of observational evidence to suggest that one of the dominant energy 
densities in the universe is contributed by self gravitating point particles. The 
smooth average energy density of these particles drive the expansion of the uni- 
verse while any small deviation from the homogeneous energy density will cluster 
gravitationally. One of the central problems in cosmology is to describe the non 
linear phases of this gravitational clustering starting from a initial spectrum of 
density fluctuations. It is often enough (and necessary) to use a statistical de- 
scription and relate different statistical indicators (like the power spectra, nth 
order correlation functions ....) of the resulting density distribution to the sta- 
tistical parameters (usually the power spectrum) of the initial distribution. The 
relevant scales at which gravitational clustering is non linear are less than about 
10 Mpc (where 1 Mpc = 3 x 10 24 cm is the typical separation between galaxies 
in the universe) while the expansion of the universe has a characteristic scale 
of about few thousand Mpc. Hence, non linear gravitational clustering in an 
expanding universe can be adequately described by Newtonian gravity provided 
the rescaling of lengths due to the background expansion is taken into account. 
This is easily done by introducing a proper coordinate for the i— th particle r^, 
related to the comoving coordinate x^, by = a(t)xi with a(t) describing the 
stretching of length scales due to cosmic expansion. The Newtonian dynamics 
works with the proper coordinates which can be translated to the behaviour of 
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the comoving coordinate Xj by this rescaling. (Some basic results in cosmology 
are summarized in Appendix A.) 

As to be expected, cosmological expansion completely changes the nature 
of the problem because of several new factors which come in: (a) The problem 
has now become time dependent and it will be pointless to look for equilibrium 
solutions in the conventional sense of the word, (b) On the other hand, the 
expansion of the universe has a civilizing influence on the particles and acts 
counter to the tendency of gravity to make systems unstable, (c) In any small 
local region of the universe, one would assume that the conclusions describing a 
finite gravitating system will still hold true approximately. In that case, particles 
in any small sub region will be driven towards configurations of local extrema 
of entropy (say, isothermal spheres) and towards global maxima of entropy (say, 
core- halo configurations). 

An extra feature comes into play as regards the expanding halo from any sub 
region. The expansion of the universe acts as a damping term in the equations 
of motion and drains the particles of their kinetic energy — which is essentially 
the lowering of temperature of any system participating in cosmic expansion. 
This, in turn, helps gravitational clustering since the potential wells of nearby 
sub regions can capture particles in the expanding halo of one region when the 
kinetic energy of the expanding halo has been sufficiently reduced. 

The actual behaviour of the system will, of course, depend on the form of a(t). 
However, for understanding the nature of clustering, one can take a(t) cx t 2 / 3, 
which describes a matter dominated universe with critical density (see Appendix 
A). Such a power law has the advantage that there is no intrinsic scale in the 
problem. Since Newtonian gravitational force is also scale free, one would ex- 
pect some scaling relations to exist in the pattern of gravitational clustering. 
Incredibly enough, converting this intuitive idea into a concrete mathematical 
statement turns out to be non trivial and difficult. I shall discuss several at- 
tempts to give concrete shape to this idea in sections 5.1, @ and || but there is 
definite scope for further work in this direction. 

To make any progress we need a theoretical formulation which will relate sta- 
tistical indicators in the non linear regime of clustering to the initial conditions. 
In particular, we need a robust prescription which will allow us to obtain the 
two-point correlation function and the nonlinear power spectrum from the initial 
power spectrum. Fortunately, this problem has been solved to a large extent and 
hence one can use this as a basis for attacking several other key questions. There 
are four key theoretical questions which are of considerable interest in this area 
that I will focus on: 



• If the initial power spectrum is sharply peaked in a narrow band of wave- 
lengths, how does the evolution transfer the power to other scales? This is, 
in some sense, analogous to determining the Green function for the gravita- 
tional clustering except that superposition will not work in the non linear 
context. 

• What is the asymptotic nature of evolution for the self gravitating system in 
an expanding background? In particular, how can one connect up the local 
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behaviour of gravitating systems to the overall evolution of clustering in the 
universe? (If we assume that the isothermal spheres play an important role 
in the local description of gravitating system, we would expect a strong trace 
of it to survive even in the context of cosmological clustering. This is indeed 
true as I shall show in sections ^, [7] and || but only in the asymptotic limit, 
under certain assumptions.) 

• Does the gravitational clustering at late stages wipe out the memory of 
initial conditions or does the late stage evolution depend on the initial power 
spectrum of fluctuations? 

• Do the virialized structures formed in an expanding universe due to grav- 
itational clustering have any invariant properties? Can their structure be 
understood from first principles? 

All the above questions are, in some sense, open and thus are good research 
problems. I will highlight the progress which has been made and give references 
to original literature for more detailed discussion. 



Part I: Gravitational clustering in static back- 
grounds 

2 Phases of the self gravitating system 

As described in section [j] the statistical mechanics of finite, self gravitating, sys- 
tems have the following characteristic features: (a) They exhibit negative specific 
heat while in virial equilibrium, (b) They arc inherently unstable to the forma- 
tion of a core-halo structure and global maximum for entropy does not exist 
without cut-offs at short and large distances, (c) They can be broadly charac- 
terized by two phases — one of which is compact and dominated by potential 
energy while the other is diffuse and behaves more or less like an ideal gas. The 
purpose of this section is to describe a simple toy model which exhibits all these 
features and mimics a self gravitating system Q. 

Consider a system with two particles described by a Hamiltonian of the form 

,„ „ N P 2 p 2 Gm 2 

tf(P,Q; P ,r) = - + !;-— (3) 

where (Q, P) are coordinates and momenta of the center of mass, (r, p) are the 
relative coordinates and momenta, M — 2m is the total mass, fi = to/2 is the 
reduced mass and m is the mass of the individual particles. This system may 
be thought of as consisting of two particles (each of mass to) interacting via 
gravity. We shall assume that the quantity r varies in the interval (a,R). This 
is equivalent to assuming that the particles are hard spheres of radius a/2 and 
that the system is confined to a spherical box of radius R. We will study the 
"statistical mechanics" of this simple toy model. 



Statistical mechanics of gravitating systems 



7 



To do this, we shall start with the volume g(E) of the constant energy surface 
H = E. Straightforward calculation gives 



g(E) = AR 3 





Gm 2 ' 


/ r dr 


E+ 


la 


r 



(4) 



where A = (647r 5 m 3 /3). The range of integration in (|]) should be limited to the 
region in which the expression in the square brackets is positive. So we should 
use r max = (Gm 2 / \E\) if (-Gm 2 /a) < E < (-Gm 2 /R), and use r max = R if 
(-Gm 2 /R) < E < +00. Since H > (-Gm 2 /a), we trivially have g(E) = for 
E < (—Gm 2 /a). The constant A is unimportant for our discussions and hence 
will be omitted from the formulas hereafter. The integration in (^) gives the 
following result: 



9(E) 
(Gm 2 ) 3 



-(-E)-i (1 + , (-Gm 2 /a) <E< (—Gm 2 / R) 



1 



RE 
Gm 1 



- 1 



_oE_ 

Gin' 2 > 



, (-Gm 2 /R) < E < 00. 



(5) 

This function g(E) is continuous and smooth at E = (—Gm 2 /R). We define the 
entropy S(E) and the temperature T(E) of the system by the relations 



S(E) = hxg(E); T~\E) = 0(E) 



dS(E) 
dE ' 



(0) 



All the interesting thermodynamic properties of the system can be understood 
from the T(E) curve. 

Consider first the case of low energies with (—Gm 2 /a) < E < (—Gm 2 /R). 
Using (||) and (||) one can easily obtain T(E) and write it in the dimensionless 
form as 

3 r 



t(e) 



(7) 



where we have defined t — (aT/Gm 2 ) and e = (aE /Gm 2 ). 

This function exhibits the peculiarities characteristic of gravitating systems. 
At the lowest energy admissible for our system, which corresponds to e = — 1, the 
temperature t vanishes. This describes a tightly bound low temperature phase 
of the system with negligible random motion. The t(e) is clearly dominated by 
the first term of (Q) for e ~ — 1. As we increase the energy of the system, the 
temperature increases, which is the normal behaviour for a system. This trend 
continues up to 



-0.36 



(8) 



at which point the t(e) curve reaches a maximum and turns around. As we in- 
crease the energy further the temperature decreases. The system exhibits negative 
specific heat in this range. 

Equation (Q) is valid from the minimum energy (—Gm 2 /a) all the way up to 
the energy (—Gm 2 /R). For realistic systems, R^> a and hence this range is quite 
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wide. For a small region in this range, [from (— Gm 2 /a) to (— 0.36Gm 2 /a)] we 
have positive specific heat; for the rest of the region the specific heat is negative. 
The positive specific heat region owes its existence to the nonzero short distance 
cutoff. If we set a — 0, the first term in ^ will vanish; we will have t oc (— e _1 ) 
and negative specific heat in this entire domain. 

For E > (—Gm 2 /R), we have to use the second expression in (||) for g(E). 
In this case, we get: 



i(e) 



3[(l + e )3-f(l + £e) 2 ] 1 



(9) 



(l + e) 3 -(l + fe) 3 

This function, of course, matches smoothly with (0) at e = —{a/R). As we 
increase the energy, the temperature continues to decrease for a little while, 
exhibiting negative specific heat. However, this behaviour is soon halted at some 
e = e 2 , say. The t(e) curve reaches a minimum at this point, turns around, and 
starts increasing with increasing e. We thus enter another (high-temperature) 
phase with positive specific heat. From (||) it is clear that t ~ (l/2)e for large e. 
(Since E — {3/2)NkT for an ideal gas, we might have expected to find t ~ (l/3)e 
for our system with N = 2 at high temperatures. This is indeed what we would 
have found if we had defined our entropy as the logarithm of the volume of 
the phase space with H < E. With our definition, the energy of the ideal gas is 
actually E = [(3/2)JV- l]kT; hence we get t = (l/2)e when N = 2). The form of 
the t{e) for (a/R) = 10~ 4 is shown in figure [l] by the dashed curve. The specific 
heat is positive along the portions AB and CD and is negative along BC. 

The overall picture is now clear. Our system has two natural energy scales: 
Ei = {—Gm 2 /a) and E 2 = (—Gm 2 /R). For E » E 2 , gravity is not strong 
enough to keep r < R and the system behaves like a gas confined by the con- 
tainer; we have a high temperature phase with positive specific heat. As we lower 
the energy to E ~ E 2 , the effects of gravity begin to be felt. For E\ < E < E 2 , 
the system is unaffected by either the box or the short distance cutoff; this is 
the domain dominated entirely by gravity and we have negative specific heat. 
As we go to E ~ Ei , the hard core nature of the particles begins to be felt and 
the gravity is again resisted. This gives rise to a low temperature phase with 
positive specific heat. 

We can also consider the effect of increasing R, keeping a and E fixed. Since 
we imagine the particles to be hard spheres of radius {a/2), we should only 
consider R > 2a. It is amusing to note that, if 2 < (R/a) < (\/3 + 1), there is 
no region of negative specific heat. As we increase R, this negative specific heat 
region appears and increasing R increases the range over which the specific heat 
is negative. Suppose a system is originally prepared with some E and R values 
such that the specific heat is positive. If we now increase R, the system may 
find itself in a region of negative specific heat. This suggests the possibility that 
an instability may be triggered in a constant energy system if its radius increases 
beyond a critical value. We will see later that this is indeed true. 

Since systems described by canonical distribution cannot exhibit negative 
specific heat, it follows that canonical distribution will lead to a very different 
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Fig. 1. The relation between temperature and energy for a model mimicking self grav- 
itating systems. The dashed line is the result for micro-canonical ensemble and the 
solid line is for canonical ensemble. The negative specific heat region, BC, in the micro- 
canonical description is replaced by a phase transition in the canonical description. See 
text for more details 



physical picture for this range of (mean) energies E\ < E < E^. It is, therefore, 
of interest to look at our system from the point of view of canonical distribution 
by computing the partition function. In the partition function 

Z(J3) = J d 3 Pd 3 pd 3 Qd 3 rexp(-f3H) (10) 

the integrations over P, p and Q can be performed trivially. Omitting an overall 
constant which is unimportant, we can write the answer in the dimensionless 
form as 

Z (t)=t 3 (-) f dxx 2 exp(— ) (11) 



where t is the dimensionless temperature defined in (0) . Though this integral 
cannot be evaluated in closed form, all the limiting properties of Z{(3) can be 
easily obtained from (pd|). 
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The integrand in ( |1 1| ) is large for both large and small x and reaches a 
minimum for x = x m = (l/2t). At high temperatures, x m < 1 and hence the 
minimum falls outside the domain of integration. The exponential contributes 
very little to the integral and we can approximate Z adequately by 



R 



3 r R/a 



r ( -J J dxx 2 



t 3 fR\ 6 ( 3a . 
= 3 a 1+ 2^)' (12) 



On the other hand, if x m > 1 the minimum lies between the limits of the 
integration and the exponential part of the curve dominates the integral. We 
can easily evaluate this contribution by a saddle point approach, and obtain 



(13) 



As we lower the temperature, making x m cross 1 from below, the contribution 



switches over from ( |12| ) to (13). The transition is exponentially sharp. The critical 
temperature at which the transition occurs can be estimated by finding the 
temperature at which the two contributions are equal. This occurs at 

_ 1 1 

tc ~ 3\MR/a)' (W) 

For t < t c , we should use (|l3|) and for t > t c we should use (p"2|). 

Given Z(f3) all thermodynamic functions can be computed. In particular, the 
mean energy of the system is given by E((3) = —(9 In Z/dj3). This relation can 
be inverted to give the T(E) which can be compared with the T(E) obtained 
earlier using the micro-canonical distribution. From (|l^ ) and (^) we get, 

eW = ^=4t-l (15) 

for t < t c and 

^)-3t-|| (16) 

for t > t c . Near t w t c , there is a rapid variation of the energy and we cannot 
use either asymptotic form. The system undergoes a phase transition at t — t c 
absorbing a large amount of energy 

The specific heat is, of course, positive throughout the range. This is to be 
expected because canonical ensemble cannot lead to negative specific heats. 

The T — E curves obtained from the canonical (unbroken line) and micro- 
canonical (dashed line) distributions are shown in figure |. (For convenience, we 
have rescaled the T — E curve of the micro-canonical distribution so that e ~ 3i 
asymptotically.) At both very low and very high temperatures, the canonical 
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and micro-canonical descriptions match. The crucial difference occurs at the in- 
termediate energies and temperatures. Micro-canonical description predicts neg- 
ative specific heat and a reasonably slow variation of energy with temperature. 
Canonical description, on the other hand, predicts a phase transition with rapid 
variation of energy with temperature. Such phase transitions are accompanied 
by large fluctuations in the energy, which is the main reason for the disagreement 
between the two descriptions Q, [ fL2[ , [ |l3fl . 

Numerical analysis of more realistic systems confirm all these features. Such 
systems exhibit a phase transition from the diffuse virialized phase to a core 
dominated phase when the temperature is lowered below a critical value flifl . 
The transition is very sharp and occurs at nearly constant temperature. The 
energy released by the formation of the compact core heats up the diffuse halo 
component. 



3 Mean field description of gravitating systems 



The analysis in the previous two sections shows that there is no global maximum 
for the entropy for a self gravitating system of point particles and the evolu- 
tion will proceed towards the formation of a core halo configuration and will 
continue inexorably in the absence of cut-offs. It is, however, possible to find 
configurations for these systems which are local maxima of the entropy. This 
configuration, called the isothermal sphere, will be of considerable significance 
in our discussions. 

Consider a system of N particles interacting with each other through the 
two-body potential U(x, y). The entropy S of this system, in the micro-canonical 
description, is defined through the relation 



9(E) = jn] d 3N xd'» P HE- 



j3N r 



(18) 

wherein we have performed the momentum integrations and replaced (3N/2 — 1) 
by (3N/2). We shall approximate the expression in ([l8]) in the following manner. 

Let the spatial volume V be divided into M (with M <C N) cells of equal 
size, large enough to contain many particles but small enough for the potential to 
be treated as a constant inside each cell. Instead of integrating over the particle 
coordinates (xx,X2, ...,xjv) we shall sum over the number of particles n a in the 
cell centered at x a (where a = 1,2, ...,M). Using the standard result that the 
integration over (Nl)^ 1 d 3N x can be replaced by 



ni=l n 2 — 1 n M — l v / \ a / v / 
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and ignoring the unimportant constant A, we can rewrite ([18]) as 



1 M 

# - - E n a U ab n b 



N 



oo oo 



E E E M^-EM ex ps[KJ] 



III- 1 712— 1 TIM — 1 



where 



S[K}] = ^ln 



1 M 

E - - ^7i a t/(x Q ,x b )n b 



ri„ In 

° I e^ 



(20) 



(21) 



In arriving at the last expression we have used the Sterling's approximation for 
the factorials. The mean field limit is now obtained by retaining in the sum in 
( p0| ) only the term for which the summand reaches the maximum value, subject 
to the constraint on the total number. That is, we use the approximation 



E< 



,S[n a ] p S[n a , max ] 



where n ajmax is the solution to the variational problem 

M 



SS 
Sn a 



= with n a = N. 



(22) 



(23) 



a=l 



Imposing this constraint by a Lagrange multiplier and using the expression ( pi| ) 
for 5, we obtain the equation satisfied by n a , max : 



1 M 

— ^ [7(x a , x 6 )n b ,ma X + In 



6=1 



r 



constant 



(24) 



where we have defined the temperature T as 



1 _ 3iV 



I 1 M \ 

V J 



= (3. 



(25) 



We see from (^l|) that this expression is also equal to (dS/dE); therefore T is 
indeed the correct thermodynamic temperature. We can now return back to the 
continuum limit by the replacements 



M 



M _ ^ M 

a—1 



V 



(26) 
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In this limit, the extremum solution ( p4| ) is given by 

p(x)=Aexp(-/3^(x)); where 0(x) - J d 3 yU(^ y)p(y) (27) 
which, in the case of gravitational interactions, becomes 

p(x) = Aexp(-/30(x)); 0(x) = -g| (28) 



Equation (|28|) represents the configuration of extremal entropy for a gravitating 
system in the mean field limit. The constant [3 is already determined through 
( p5[ ) in terms of the total energy of the system. The constant A has to be fixed 
in terms of the total number (or mass) of the particles in the system. 

The various manipulations in ( p"8| ) to ( p4| ) tacitly assume that the expressions 
we are dealing with are finite. But for gravitational interactions without a short 
distance cut-off, the quantity e s - and hence all the terms we have been handling 
- are divergent. We should, therefore, remember that a short distance cut-off is 



needed to justify the entire procedure Q9|, |17[, [18] and that ([28|) — which is 
based on a strict r -1 potential and does not incorporate any such cutoff — can 
only be approximately correct. We will now study some of the properties of the 



solution of fl28|) which is a local extremum of the entropy. 
4 Isothermal sphere 

The extremum condition for the entropy, equation ((2^), is equivalent to the 
differential equation for the gravitational potential: 

V 2 = 47rGp c e-W( x )-^°)] (29) 

Given the solution to this equation, all other quantities can be determined. As 
we shall see, this system shows several peculiarities. 

It is convenient to introduce the length, mass and energy scale by the defi- 
nitions 

L = (4vrGp c /?) 1/2 , M =4tt Pc LI O = /T 1 = ^ (30) 

where p c = p(0). All other physical variables can be expressed in terms of the 
dimensionless quantities 

x=^—, n=— , m = M ^ , y = p [</> - (f> (0)] . (31) 
L p c Mo 

In terms of y{x) the isothermal equation ( p9| ) becomes 

with the boundary condition y(0) = y'(0) = 0. Let us consider the nature of 
solutions to this equation. 
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By direct substitution, we see that n = (2/x 2 ) , m = 2x,y = 2lnx satisfies 
these equations. This solution, however, is singular at the origin and hence is not 
physically admissible. The importance of this solution lies in the fact that other 
(physically admissible) solutions tend to this solution Q, |ll| for large values 
of x. This asymptotic behavior of all solutions shows that the density decreases 
as (\/r 2 ) for large r implying that the mass contained inside a sphere of radius 
r increases as M{r) a r at large r. To find physically useful solutions, it is 
necessary to assume that the solution is cutoff at some radius R. For example, 
one may assume that the system is enclosed in a spherical box of radius R. In 
what follows, it will be assumed that the system has some cutoff radius R. 

The equation (|3^) is invariant under the transformation y — ► y + a ; x — > kx 
with k 2 = e a . This invariance implies that, given a solution with some value of 
2/(0), we can obtain the solution with any other value of yjfi) by simple rescaling. 
Therefore, only one of the two integration constants in ( |32| ) is really non-trivial. 
Hence it must be possible to reduce the degree of the equation from two to one 
by a judicious choice of variables fl9|| . One such set of variables are: 

to nx 3 nx 2 . . 

v — — ; u — = . (33) 



In terms of v and u, equation (29) becomes 

u dv = (u-1) - 34 . 
v du (u + v — 3) 

The boundary conditions y(0) = y'(0) = translate into the following: v is zero 
at u = 3, and (dv/du) = —5/3 at (3,0). The solution v (u) has to be obtained 
numerically: it is plotted in figure]^ as the spiraling curve. The singular points of 
this differential equation are given by the intersection of the straight lines u — 1 
and u + v = 3 on which, the numerator and denominator of the right hand side 
of ( |34| ) vanishes; that is, the singular point is at u s — 1, v s = 2 corresponding to 
the solution n — (2/x 2 ),m = 2x. It is obvious from the nature of the equations 
that the solutions will spiral around the singular point. 

The nature of the solution shown in figure || allows us to put an interesting 
bounds on physical quantities including energy. To see this, we shall compute 
the total energy E of the isothermal sphere. The potential and kinetic energies 
are 

f R GM(r) dM , GM 2 f xo 

U = — I dr = / mnxdx 

Jo r dr Lo J 

3M 3GM 2 . . GM 2 i f x ° 2l 

K = olT = o-T^ m *o - / nx 2 dx (35) 

2 (i 2 Lo L 2 J 

where xo = R/Lo- The total energy is, therefore, 

GM 2 f x ° 



GM 2 f x ° , , 
E = K + U = ——2- / dx(3nx 2 - 2mnx) 
2L Jo 

GM 2 f X0 ^ d ro ^ 3 GM 2 



. dx—{2nx - 3to} = — - — {n x - -to } (36) 
ZLq Jo ax Lo 2 
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where hq = n(xo) and mo = m(xo). The dimensionless quantity (RE/GM 2 ) is 
given by 

*-!&-;>>-§>■ < 37 » 

Note that the combination (RE/GM 2 ) is a function of (u,v) alone. Let us now 
consider the constraints on A. Suppose we specify some value for A by specifying 
R, E and M. Then such an isothermal sphere must lie on the curve 



1 / 3\ RE 



v = — it — — 



A V 2 J ' GM 2 



(38) 



which is a straight line through the point (1.5,0) with the slope A -1 . On the 
other hand, since all isothermal spheres must lie on the u—v curve, an isothermal 
sphere can exist only if the line in (Ba) intersects the u — v curve. 




Fig. 2. Bound on RE/GM 2 for the isothermal sphere 



For large positive A (positive E) there is just one intersection. When A = 0, 
(zero energy) we still have a unique isothermal sphere. (For A = 0, equation (pq) 
is a vertical line through u = 3/2.). When A is negative (negative E), the line can 
cut the u — v curve at more than one point; thus more than one isothermal sphere 
can exist with a given value of A. [Of course, specifying M, i?, E individually will 
remove this non- uniqueness] . But as we decrease A (more and more negative 
E) the line in (|38| ) will slope more and more to the left; and when A is smaller 
than a critical value A c , the intersection will cease to exist. Thus no isothermal 
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sphere can exist if (RE /GM 2 ) is below a critical value AcT This fact follows 
immediately from the nature of u — v curve and equation (p8). The value of A c 
can be found from the numerical solution in figure. It turns out to be about 
(-0.335). 

The isothermal sphere has a special status as a solution to the mean field 
equations. Isothermal spheres, however, cannot exist if (RE /GM 2 ) < —0.335. 
Even when (RE /GM 2 ) > —0.335, the isothermal solution need not be stable. 
The stability of this solution can be investigated by studying the second variation 
of the entropy. Such a detailed analysis shows that the following results are 
true ||, & 01 • (i) Systems with (RE/GM 2 ) < -0.335 cannot evolve into 
isothermal spheres. Entropy has no extremum for such systems, (ii) Systems 
with ((RE/GM 2 ) > -0.335) and (p(0) > 709 p(R)) can exist in a meta-stable 
(saddle point state) isothermal sphere configuration. Here p(0) and p(R) denote 
the densities at the center and edge respectively. The entropy extrema exist 
but they are not local maxima, (iii) Systems with ((RE/GM 2 ) > —0.335) and 
(p(0) < 709 p(R)) can form isothermal spheres which are local maximum of 
entropy. 

Part II. Gravitational clustering in expanding uni- 
verse 



Let us next consider the gravitational clustering of a system of collision-less 
point particles in an expanding universe which poses several challenging the- 
oretical questions. Though the problem can be tackled in a 'practical' manner 
using high resolution numerical simulations, such an approach hides the physical 
principles which govern the behaviour of the system. To understand the physics, 
it is necessary that we attack the problem from several directions using ana- 
lytic and semi analytic methods. These sections will describe such attempts and 
will emphasize the semi analytic approach and outstanding issues, rather than 
more well established results. Some basic results in cosmology are summarized 
in Appendix A. 



5 Gravitational clustering at nonlinear scales 

The expansion of the universe sets a natural length scale (called the Hubble ra- 
dius) dn — c(a/a)^ 1 which is about 4000 Mpc in the current universe. Since the 
non linear effects due to gravitational clustering occur at significantly smaller 
length scales, it is possible to use Newtonian gravity to describe these phe- 
nomena. In any region small compared to dn one can set up an unambiguous 

1 This derivation is due to the author |L6|. It is surprising that Chandrasekhar, who 
has worked out the isothermal sphere in uv coordinates as early as 1939, missed 
discovering the energy bound shown in figure ^. Chandrasekhar Jl!| has the uv 
curve but does not over-plot lines of constant A. If he had done that, he would have 
discovered Antonov instability decades before Antonov did ||15[. 
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coordinate system in which the proper coordinate of a particle r(t) — a(t)x(t) 
satisfies the Newtonian equation r = — V r <£ where <P is the gravitational poten- 
tial. Expanding r and writing <P = <Pfwn + 4> where ^frw is due to the smooth 
(mean) density of matter and (f> is due to the perturbation in the density, we get 

ax + 2ax + ax = — V r ^FRW — V r = — V r ^FRW — a _1 V x (39) 

The first terms on both sides of the equation (ox and — V r ^FR\v) should match 
since they refer to the global expansion of the background FRW universe (see 



equation (115) of Appendix A). Equating them individually gives the results 



x + 2-x = — -V x (f> ; <Z>frw = -7r-r = ^~Pb^ (40) 

a a z 2 a 6 

where cf> is the gravitational potential generated by the perturbed, Newtonian, 
mass density through 

V 2 x <j) = AirGa 2 {Sp) = AirGp b a 2 d. (41) 

If Xj(t) is the trajectory of the i— th particle, then equations for gravitational 
clustering in an expanding universe, in the Newtonian limit, can be summarized 
as 

Xi + — x< = „-V x <& V 2 x (j) = 4nGa 2 Pb S (42) 

a a z 

where Pb(t) is the smooth background density of matter. We stress that, in 
the non-relativistic limit, the perturbed potential <j) satisfies the usual Poisson 
equation. 

Usually one is interested in the evolution of the density contrast S (t, x) = 
[p(t, x) — Pb(t)]/ Pb(t) rather than in the trajectories. Since the density contrast 
can be expressed in terms of the trajectories of the particles, it should be possible 
to write down a differential equation for <5(i, x) based on the equations for the 
trajectories Xj(£) derived above. It is, however, somewhat easier to write down 
an equation for 6k(t) which is the spatial Fourier transform of S(t, x). To do this, 
we begin with the fact that the density p(x, t) due to a set of point particles, 
each of mass to, is given by 

p(x, i ) = -^^o D [x-x i (t)] (43) 

where x,(t) is the trajectory of the ith particle. To verify the a -3 normalization, 
we can calculate the average of p(x, t) over a large volume V. We get 

f d 3 x , , m ( N\ M po 

^)^j^p^) = W) {v) = M = i (44) 

where N is the total number of particles inside the volume V and M = Nm 
is the mass contributed by them. Clearly p b cx a -3 , as it should. The density 
contrast S(x,t) is related to p(x, t) by 

l + < 5(x,t) = ^^ = -^Vfo[x-x < (t)]= /dqfe[x-x T («,q)]. (45) 
p b N ^ J 
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In arriving at the last equality we have taken the continuum limit by replacing: 
(i) Xj(i) by Xx(t, q) where q stands for a set of parameters (like the initial 
position, velocity etc.) of a particle; for simplicity, we shall take this to be initial 
position, (ii) (V/N) by d 3 q since both represent volume per particle. Fourier 
transforming both sides we get 



S k (t) = J d 3 xe~ jkx c5(x, t) = J d 3 q exp[-tk.xr(t,q)] - (27r) 3 ,5 D (k) 



(46) 



Differentiating this expression, and using the equation of motion (|43) for the 

11): 



trajectories give, after straightforward algebra, the equation (see j2l|, |2 

Sk + 2-6k = \ [ d 3 qe~ ik - XT(t,q) | lk . V - a 2 (k • x T ) 2 } 
a a'' J 

which can be further manipulated to give 



5 k + 2-S k = AmGp h b k + A k - B k 
a 



with 



A k = inGpt 



d 3 k' 



<5k'^k-k' 



k.k' 



k' 2 



(2tt) 3 

£? k = J d 3 q (k.xy) 2 exp [— ik.XT(i, q 



(47) 



(48) 



(49) 
(50) 



This equation is exact but involves Xy(t, q) on the right hand side and hence 
cannot be considered as closed. 

The structure of ( f48| ) and (|50|) can be simplified if we use the perturbed 
gravitational potential (in Fourier space) <f> k related to 5 k by 



<5 k = - 



AnGp b a 2 \4irGp 



3H 2 



fc 2 a0k 



(51) 



and write the integrand for A k in the symmetrised form as 



6v5 



k'Ok-k' 



k.k' 



= ^k'^k-k' 



k.k' k.(k-k') 



k' 2 



k' 2 

|k-k' 



Ik k 



/|2 



'12 



1 / 2a 



2 \3H 2 



3k"fk-k' 



(k - k') 2 k.k' + k' 2 (k 2 - k.k') 
fc 2 (k.k' + fc' 2 )-2(k.k') 2 l 



(52) 



In terms of </>k (with k' = (k/2) + p), equation (|4^) becomes, 



1 



. a ■ 

2a 2 J (2tt) 3 ' 



d 3 p 



rf 3 c 



k+p^ik-p 

2 



k.X 



v k . x 



(53) 



where x = x-r (t, q) . We shall now consider several applications of this equation. 
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5.1 Application 1: 'Renormalizability' of gravity 

Gravitational clustering in an expanding universe brings out an interesting fea- 
ture about gravity which can be described along the following lines. Let us 
consider a large number of particles which are interacting via gravity in an ex- 
panding background and form bound gravitating systems. At some time t, let 
us assume that a fraction / of the particles are in virialized, self-gravitating 
clusters which are reasonably immune to the effect of expansion. Imagine that 
we replace each of these clusters by single particles at their centers of mass with 
masses equal to the total mass of the corresponding clusters. (The total number 
of particles have now been reduced but, if the original number was sufficiently 
large, we may assume that the resulting number of particles is again large enough 
to carry on further evolution with a valid statistical description.) We now evolve 
the resulting system to a time t' and compare the result with what would have 
been obtained if we had evolved the original system directly to t'. Obviously, 
the characteristics of the system at small scales (corresponding to the typical 
size R of the clusters at time t) will be quite different. However, at large scales 
(kR <§C 1), the characteristics will be the same both the systems. In other words, 
the effect of a bunch of particles, in a virialized cluster, on the rest of the system 
is described, to the lowest order, by just the monopole moment of the cluster - 
which is taken into account by replacing the cluster by a single particle at the 
center of mass having appropriate mass. In this sense, gravitational interactions 
are "renormalizable" - where the term is used in the specific sense defined above. 

The result has been explicitly verified in simulations |2^| but one must em- 
phasize that the whole idea of numerical simulations of such systems tacitly 
assumes the validity of this result. If the detailed non linear behaviour at small 
scales, say within galaxies, influences very large scale behaviour of the universe 
(say, at super cluster scales), then it will be impossible to simulate large scale 
structure in the universe with finite resolution. 

One may wonder how this feature (renormalizability of gravity) is taken care 
of in equation ((48[). Inside a galaxy cluster, for example, the velocities x-r can 
be quite high and one might think that this could influence the evolution of 5^ 
at all scales. This does not happen and, to the lowest order, the contribution 
from virialized bound clusters cancel in — B^. We shall now provide a proof 
of this result ||. 

We begin by writing the right hand side 7?. of the equation ( p7| ) concentrating 
on the particles in a given cluster. 



where we have used the notation Q = xt for the trajectories of the particles and 
the subscripts a, b, .... — 1, 2, 3 denote the components of the vector. For a set of 
particles which form a bound virialized cluster, we have from ( [f2"| ) the equation 
of motion 




(54) 



Q l + 2-Q l 



a 



1 d<f> 



(55) 
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We multiply this equation by Q J , sum over the particles in the particular cluster 
and symmetrize on i and j, to obtain the equation 

(56) 

We use the summation symbol, rather than integration over q merely to empha- 
size the fact that the sum is over particles of a given cluster. Let us now consider 
the first term in the right hand side of (|54|) with the origin of the coordinate 
system shifted to the center of mass of the cluster. Expanding the exponential 
as e~ lk ' Q « (1 — ik ■ Q) + 0{k 2 R 2 ) where R is the size of the cluster, we find 
that in the first term, proportional to V</), the sum of the forces acting on all 
the particles in the cluster (due to self gravity) vanishes. The second term gives, 
on symmetrization, 

Using ([56|) we find that 

£ *«- 2 (k. W)e-* -Q = + £(k • Q) 2 + \ + 2^|) 5> ■ Q) 2 (58) 

The second term is of order 0(k 2 R 2 ) and can be ignored, giving 

^ ia- 2 (k-V0)e- lk - Q « + J^(k ■ Q) 2 + 0(k 2 R 2 ) (59) 

Consider next the second term in the right hand side of (|54|) with the same 
expansion for the exponential. We get 

^(i/c a )e- lkQ \iQ a k b Q b ] =-J2k a k b Q a Q b {l-ik-Q)+0(k 2 R 2 ) 

= - £(k • Q) 2 + £(k • Q) 2 k a Q a + 0{k 2 R 2 ) (60) 

The second term is effectively zero for any cluster of particles for which Q — > — Q 
is a symmetry. Hence the two terms on the right hand side of (^Jj) cancel each 
other for all particles in the same virialized cluster; that is, the term (A^ — B^) 
receives contribution only from particles which are not bound to any of the 
clusters to the order 0(k 2 R 2 ). If the typical size of the clusters formed at time 
t is R, then for wave-numbers with k 2 R 2 <C 1, we can ignore the contribution 
from the clusters. Hence, in the limit of k — > we can ignore (A^ — B^) term 
and treat equation (|4^) as linear in 5^; large spatial scales in the universe can 
be described by linear perturbation theory even when small spatial scales are 
highly non linear. 

There is, however, an important caveat to this claim. In the right hand side of 
( p8[ ) one is comparing the first term (which is linear in <5k) with the contribution 
(At — -Bk)- If, at the relevant wavenumber, the first term AirGpbdk is negligibly 
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small, then the only contribution will come from (A^ — B^) and, of course, we 
cannot ignore it in this case. The above discussion shows that this contribution 
will scale as k 2 R 2 and will lead to a development of <5k oc k 2 if originally (in 
linear theory) 8k oc k n with n > 2 as k — * 0. 

We shall next describe the linear evolution; the development of 5^ oc k 2 tail 
at large spatial scales will be taken up in section 5.4. 



5.2 Application 2: Evolution at large scales. 

If the density contrasts are small and linear perturbation theory is to be valid, 
we should be able to ignore the terms A k and B k in (|48|). Thus the linear 
perturbation theory in Newtonian limit is governed by the equation 

4 + 2- <5 k = 4iTGp b 5 k (61) 
a 

From the structure of equation ( f48| ) it is clear that we will obtain the linear 
equation if A k <C AirGpbS^ and Bk <C 47rGpb<5k. A necessary condition for this 
(5k 1 but this is not a sufficient condition — a fact often ignored or incorrectly 
treated in literature. As we saw in the last section, if 5^ — > for certain range of 
k at t = tQ (but is nonzero elsewhere) then (A^ — B^) ^> AirGpbSk and the growth 
of perturbations around k will be entirely determined by nonlinear effects. 

For the present, we shall assume that A^ — B^ is ignorable and study the 
resulting system. In that case, equation (^Tj) has the growing solution 5^ (t) = 
[a(i)/a(ii)]^k(^i) in the matter dominated universe with a(t) oc t 2 / 3 ,pt, oc a~ 3 . 
This shows that when linear perturbation theory is applicable the density pertur- 
bations grow as a(t). The power spectrum P(k, t) — \S k (t)\ 2 and the correlation 
function £ (x, t) [which is the Fourier transform of the power spectrum] both grow 
as a 2 (t) while 0k oc k~ 2 (S^/a) remains constant in time. This analysis allows 
us to fix the evolution of clustering at sufficiently large scales (that is, for suffi- 
ciently small k) uniquely. The clustering at these scales which is well described 
by linear theory, and the power spectrum grows as a 2 . 



5.3 Application 3: Formation of first non linear structures 

A useful insight into the nature of linear perturbation theory (as well as nonlin- 
ear clustering) can be obtained by examining the nature of particle trajectories 
which lead to the growth of the density contrast 5l{o) oc a obtained above. To 
determine the particle trajectories corresponding to the linear limit, let us start 
by writing the trajectories in the form 

x T (a,q) = q + L(a,q) (62) 

where q is the Lagrangian coordinate (indicating the original position of the 
particle) and L(a,q) is the displacement. The corresponding Fourier transform 
of the density contrast is given by the general expression 

<5 k (a) = / d 3 qe -,k. q ^k.L(a, q) _ (27r) 3£ D . rac[k] ( 63) 
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In the linear regime, we expect the particles to have moved very little and hence 
we can expand the integrand in the above equation in a Taylor series in (k • L). 
This gives, to the lowest order, 



c (o) S - J tfV~ ik - q ( Z k • L(a, q)) = - J d 3 qe- 4k q (V q • L) 



(64) 



showing that <5k(ct) is Fourier transform of — V q .L(a, q). This allows us to identify 
V • L(a, q) with the original density contrast in real space — <$ q (a). Using the 
Poisson equation we can write Sq(a) as a divergence; that is 



V • L(o, q) - -<5 q (a) = -^H Q 2 aV ■ (V</>) 



(65) 



which, in turn, shows that a consistent set of displacements that will lead to 
5(a) oc a is given by 



L(o,q) = -(W)a = au(q); V = (2/3)/^ 
The trajectories in this limit are, therefore, linear in a: 

x T (a,q) = q + au(q) 



(66) 



(67) 



A useful approximation to describe the quasi linear stages of clustering is 
obtained by using the trajectory in ( |67| ) as an ansatz valid even at quasi lin- 
ear epochs. In this approximation, called Zeldovich approximation, the proper 
Eulerian position r of a particle is related to its Lagrangian position q by 



r(i) = a(t)x(t) = a(t)[q + a(t)u(q)] 



(68) 



where x(i) is the comoving Eulerian coordinate. If the initial, unperturbed, den- 
sity is p (which is independent of q), then the conservation of mass implies that 
the perturbed density will be 



Therefore 
p(r,t) = p 



det 



p(r, t)d 3 r = pafq. 



p/a 3 



Pb(t) 



det(dxj/dqi) det(<5y + a(t)(duj /dqi)) 



(69) 



(70) 



where we have set pb(t) = [p/a 3 (t)]. Since u(q) is a gradient of a scalar function, 
the Jacobian in the denominator of ( f70| ) is the determinant of a real symmetric 
matrix. This matrix can be diagonalised at every point q, to yield a set of 
eigenvalues and principal axes as a function of q. If the eigenvalues of (duj/dqi) 
are [— Ai(q), — A2(q), — Aa(q)] then the perturbed density is given by 



Pb(t) 



(1 - o(t)Ai(q))(l - o(t)A a (q))(l - a(t)A 3 (q)) 



(71) 



Statistical mechanics of gravitating systems 



23 



where q can be expressed as a function of r by solving (J6^). This expression 
describes the effect of deformation of an infinitesimal, cubical, volume (with the 
faces of the cube determined by the eigenvectors corresponding to A„) and the 
consequent change in the density. A positive A denotes collapse and negative A 
signals expansion. 

In a over dense region, the density will become infinite if one of the terms 
in brackets in the denominator of ( |7l| ) becomes zero. In the generic case, these 
eigenvalues will be different from each other; so that we can take, say, Ai > 
A2 > A3. At any particular value of q the density will diverge for the first 
time when (1 — a(i)Ai) = 0; at this instant the material contained in a cube 
in the q space gets compressed to a sheet in the r space, along the principal 
axis corresponding to Ai. Thus sheet like structures, or 'pancakes', will be the 
first nonlinear structures to form when gravitational instability amplifies density 
perturbations. 



5.4 Application 4: A non linear tail at small wavenumber 

There is an interesting and curious result which is characteristic of gravitational 
clustering that can be obtained directly from our equation (|5^). Consider an 
initial power spectrum which has very little power at large scales; more precisely, 
we shall assume that P(k) dies faster than fc 4 for small k. If these large scales 
are described by linear theory — as one would have normally expected, then the 
power at these scales can only grow as a 2 and it will always be sub dominant to 
fc 4 . It turns out that this conclusion is incorrect. As the system evolves, small 
scale nonlinearities will develop in the system and — if the large scales have too 
little power intrinsically (i.e. if n is large) — then the long wavelength power 
will soon be dominated by the "tail" of the short wavelength power arising from 
the nonlinear clustering. This occurs because, in equation (Is)), the nonlinear 
term (Ak - £?k) = 0(k 2 R 2 ) can dominate over AirGpbSk at long wavelengths (as 
k — > 0) and lead to the development of a k A power spectrum at large scales. This 
is a purely non linear effect which we shall now describe. 

To do this, we shall use the Zeldovich approximation to obtain Q| a closed 
equation for <£k- The trajectories in Zeldovich approximation, given by (^) can 



be used in (53) to provide a closed integral equation for 0k- In this case 

'2a\„, , 2 



xr(q,a) = q + aV?/>; x T = ( — ) V^; ^=^2^ (72) 



and, to the same order of accuracy, £?k in (J5(J) becomes: 

J d 3 q (k ■ x T ) 2 e- ik (q+L) S J d 3 q(k ■ x T ) 2 e- lk q (73) 

Substituting these expressions in ( |53| ) we find that the gravitational potential is 
described by the closed integral equation: 

a ■ 1 f d 3 p 

0k + 4-0 k = -^2 J 7^j3^|k + p^ik-pe(k,p) 
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(74) 

This equation provides a powerful method for analyzing non linear clustering 
since estimating (A^ — B^) by Zeldovich approximation has a very large domain 
of applicability. 

A formal way of obtaining the fc 4 tail is to solve equation ((74]) for long 
wavelengths [Q; i.e. near k = 0. Writing k = <p£' + 0^ + .... where (j)^ = <p^ 

(2) 

is the time independent gravitational potential in the linear theory and (pi is 
the next order correction, we get from ((7^), the equation 

The solution to this equation is the sum of a solution to the homogeneous part 
[which decays as <^> oc a~ 4 cx t~ 8 / 3 giving <j) oc t~ 5 ^ 3 } and a particular solution 

(2) 

which grows as a. Ignoring the decaying mode at late times and taking <p) ' = 
aCk one can determine Ck from the above equation. Plugging it back, we find 
the lowest order correction to be, 

^ a -fe)/(^M>.-^.p> <™> 

Near k ~ 0, we have 




(77) 

which is independent of k to the lowest order. Correspondingly the power spec- 
trum for density Pg(k) oc a 2 k 4: P ip (k) tx a 4 fc 4 in this order. 

The generation of long wavelength fc 4 tail is easily seen in simulations if 
one starts with a power spectrum that is sharply peaked in |k|. Figure || shows 
the results of such a simulation (25) in which the y-axis is \A(k)/a(t)] where 
A 2 (k) = k 3 P/2ir 2 is the power per logarithmic band in fc. In linear theory 
A oc a and this quantity should not change. The curves labelled by a = 0.12 to 
a = 20.0 show the effects of nonlinear evolution, especially the development of 
fc 4 tail. 

5.5 Application 5: Generation of small scale power 

Figure H also shows that, as the clustering proceeds, power is generated at spatial 
scales smaller than the scale fc^ 1 at which the power is injected. This feature can 
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Fig. 3. The transfer of power to long wavelengths forming a k 4 tail is illustrated using 
simulation results. Power is injected in the form of a narrow peak at L — 8. Note that 
the y— axis is (A/a) so that there will be no change of shape under linear evolution 
with A tx a. As time goes on a k 4 tail is generated which itself evolves according to 
the nonlinear scaling relation discussed later on 



also be easily understood j23| from our equation (j7(j) . Let the initial gravitational 
potential and the density contrast (in the linear theory) be sharply peaked at 
the wave number ko, say, with: 



% = ^ - fc ]; k%6£ = W|k| - ko] (78) 



where fi is dimensionless constant indicating the strength of the potential and the 
other factors ensure the correct dimensions. Equation ([76]) shows that, the right 
hand side is nonzero only when the magnitudes of both the vectors [(l/2)k + p] 
and [(l/2)k — p] are ko- This requires k p = 0, (k/2) 2 +p 2 = k 2 . (This constraint 
has a simple geometric interpretation: Given any k, with k < 2ko one constructs 
a vector k/2 inside a sphere of radius ko and a vector p perpendicular to k/2 
reaching up to the shell at radius ko where the initial power resides. Obviously, 
this construction is possible only for k < 2ko ) Performing the integration in (fr6|) 
we find that 

,(2) _ fJT fl " 



■ 2 ffi ( k 2 \ ( k 2 



^ -^ 2 ii a \ l -wj\ l + wj (79) 

[We have again ignored the decaying mode which arises as a solution to the 
homogeneous part.] The corresponding power spectrum for the density field 
P(k) = \5 k \ 2 cx a 2 k 4 \(t) k \ 2 will evolve as 

p( 2 )(fc)oc(HV(i-|? 2 )7i + ^ 2 ) 2 ; ? = £ (80) 
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Fig. 4. Analytic model for transfer of power in gravitational clustering. The initial 
power was injected at the wave number ko with a Gaussian window of width Ak/ko = 
0.1. First order calculation shows that the power is transfered to larger spatial scales 
with a fc 4 tail and to the shorter spatial scales, all the way down to (l/2)kg 1 . The plot 
gives the total power spectrum divided by a 2 (with y— axis normalized arbitrarily) at 
different times with a 2 changing by factor 10 between any two curves 



The power at large spatial scales (k — > 0) varies as k 4 as discussed before. The 
power has also been generated at smaller scales in the range fco < k < 2fco with 
p( 2 \k) being a maximum at k m ~ 1.54fco corresponding to the length scale 
^m 1 ~ O-dSkQ 1 . Figure ^ shows the power spectrum for density field (divided 
by a 2 to eliminate linear growth) computed analytically for a narrow Gaussian 
initial power spectrum centered at fco = 1. The curves are for (/j,a/567r 2 ) 2 = 
10 -3 , 1Q~ 2 , 10 _1 and 1. The similarity between figures |3| and |] is striking and 
allows us to understand the simulation results. The key difference is that, in the 
simulations, newly generated power will further produce power at 4fco,8fcoj--- 
and each of these will give rise to a k tail to the right. The resultant power 
will, of course, be more complicated than predicted by our analytic model. The 
generation of power near this maximum at fc" 1 = 0.65fcT is clearly visible as a 
second peak in figure ^ and around 2ir/ko w 4 in figure y . 

If we had taken the initial power spectrum to be Dirac delta function in the 
wave vector k (rather than on the magnitude of the wave vector, as we have 
done) the right hand side of ( fT6| ) will contribute only when (~k±p) = k . This 
requires p = and k = 2k showing that the power is generated exactly at the 
second harmonic of the wave number. Spreading the initial power on a shell of 
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radius ko, spreads the power over different vectors leading to the result obtained 
above. 

Equation (fn|) shows that k^S^ will reach non linearity for [ia w (3/2). The 
situation is different as regards the gravitational potential due to the large nu- 
merical factor 567r 2 ; the gravitational potential fluctuations are comparable to 
the original fluctuations only when [ia ~ 567r 2 . We shall say more about power 
transfer in gravitational clustering in section [?|. 

5.6 Application 6: Spherical approximation 

In the nonlinear regime — when S <; 1 — it is not possible to solve equation (S) 
exactly. Some progress, however, can be made if we assume that the trajectories 
are homogeneous; i.e. x(t, q) = /(t)q where f(t) is to be determined. In this 
case, the density contrast is 

S k (t) = J d 3 qe- l/(t)k - q - (2^) 3 Mk) = (2^) 3 Mk)[.r 3 - 1] = (27r) 3 6 D (k)S(t) 

(81) 

where we have defined S(t) = [/ 3 (t) — l] as the amplitude of the density con- 
trast for the k = mode. It is now straightforward to compute A and B in (Q) . 
We have 

A = 4irGp b 6 2 (t)[(2ir) 3 6 D (k)} (82) 



and 



B = j d 3 q(k a q a ) 2 f 2 e^ k ^ =-/ 2 ^[(27r) 3 M/k)] 

= -l^[(2^S Dm (83) 
so that the equation (|I^) becomes 

§ + 2-8 = A-KGp b {l+8)5+^ 7 ^— (84) 
a 3 (1 + o) 

To understand what this equation means, let us consider, at some initial epoch 
U, a spherical region of the universe which has a slight constant over-density 
compared to the background. As the universe expands, the over-dense region 
will expand more slowly compared to the background, will reach a maximum 
radius, contract and virialize to form a bound nonlinear system. Such a model 
is called "spherical top-hat" . It is possible to show that equation (J8J) is the 
same as the equation governing density evolution in a spherical model J4| . . The 
detailed analysis of the spherical model shows that the virialized systems 
formed at any given time have a mean density which is typically 200 times the 
background density of the universe at that time. Hence, it is often convenient 
to divide the growth of structures into three regimes: linear regime with S -C 1; 
quasi-linear regime with 1 5 < 200; and nonlinear regime with S <; 200. 
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6 Nonlinear scaling relations 



Given an initial density contrast, one can trivially obtain the two point correla- 
tion function at any later epoch in the linear theory. If there is a procedure for 
relating the nonlinear correlation function and linear correlation function (even 
approximately) then one can make considerable progress in understanding non- 
linear clustering. It is actually possible to do this (^7), (2?|, [j30| and relate 
the exact mean correlation function 

Z(t,x) = ^ [ X t(t,y)y 2 dy (85) 
x Jo 

to the one computed in the linear theory along the following lines: 

The mean number of neighbours within a distance x of any given particle is 

N(x, t) = (na 3 ) f ^y 2 dy[l + £(y, t)} (86) 

J o 

when n is the comoving number density. Hence the conservation law for pairs 
implies 

i + S5W+<W-0 (87) 

where v(t, x) denotes the mean relative velocity of pairs at separation x and 
epoch t. Using 

i 1+ Q = &m^ 1+ ® (88) 



we get 



1 d 3 d - Id 
Integrating, we find: 

[The integration would allow the addition of an arbitrary function of t on the 
right hand side. We have set this function to zero so as to reproduce the correct 
limiting behaviour]. It is now convenient to change the variables from t to a, 
thereby getting an equation for £: 

^ 1 + ^^=(i)^I^ 1+ ^^ ^ 



or, defining h(a,x) = —(v/ax) 
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This equation shows that the behaviour of £(a, x) is essentially decided by h, the 
dimcnsionlcss ratio between the mean relative velocity v and the Hubble velocity 
ax = (a/a)r, both evaluated at scale x. 

In the extreme nonlinear limit K> l) , we may expect bound structures not 
to expand with Hubble flow. To maintain a stable structure, the relative pair 
velocity v (a,x) of particles separated by x should balance the Hubble velocity 
Hr = ax; hence, v = —ax or h(a,x) = 1. The behaviour of h(a,x) for ( « 1 
is more complicated and can be shown [§},[^2| that h = (2/3)£ in the limit of 
( < 1. Thus h(a,x) depends on (a, x) only through £(a,x) in the linear limit, 
while h = —1 is the nonlinear limit. This suggests the ansatz that h depends 
on a and x only through ^(a,x); that is, we assume that h(a,x) = h[^(a,x)]. It 
is then possible to find a solution to ( |92| ) which reduces to the form £ oc a 2 for 
£ <C 1 using the method of characteristics |^]. The final result can be given as: 

( 1 f^ a - x '> dn \ - 

bM="*[sj mrw))^ i=^+^r 3 - (93) 

Given the function this relates £j, and £ or — equivalently — gives the 

mapping £(a, x) = J7[£i(a, I)] between the nonlinear and linear correlation func- 
tions evaluated at different scales x and I. The lower limit of the integral is 
chosen to give ln£ for small values of £ on the linear regime. [The (2/3) factor 
in the exponent becomes (2/13) in D-dimensions and I — x[l + £(a, a;)] 1 /' .] 

The following points need to be stressed regarding this result: (i) Among all 
statistical indicators, it is only £ which obeys a nonlinear scaling relation (NSR) 
of the form £nl(o>,x) — U [£l(<i, /)]. Attempts to write similar relations for £ 
or P(k) have no fundamental justification, (ii) The non locality of the relation 
represents the transfer of power in gravitational clustering and cannot be ignored 
— or approximated by a local relation between £_/vx(a, x) and £l(o, x). 

Given the form of /i(£~), equation ( |9^ ) determines the relation £ = It 
is, however, easier to determine the form of U, directly from theory and this 
was done in ps[ |. Here, I shall provide a more intuitive but physically motivated 
derivation along the following lines: 

In the linear regime (£ < < 1)) we clearly have U(^l) — £l- We can 
divide the non linear phases of evolution conveniently into two parts, which I 
will call quasi-linear (1 <, £ < 200) and non linear (200 ^ £). In the quasi- linear 
phase, regions of high density contrast will undergo collapse and in the non linear 
phase more and more virialized systems will get formed. We recall that, in the 
study of finite gravitating systems made of point particles and interacting via 
Newtonian gravity, isothermal spheres play an important role and are the local 
maxima of entropy. Hence dynamical evolution drives the system towards an 
(1/x 2 ) density profile. Since one expects similar considerations to hold at small 
scales, during the late stages of evolution of the universe, we may hope that 
isothermal spheres with (1/x 2 ) profile may still play a role in the late stages of 
evolution of clustering in an expanding background. However, while converting 
the density profile to correlation function, we need to distinguish between two 
cases. In the quasi-linear regime, dominated by the collapse of high density 
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peaks, the density profile around any peak will scale as the correlation function 
and we will have £ oc (1/x 2 ). On the other hand, in the nonlinear end, we will be 
probing the structure inside a single halo and £(x) will vary as < p(x + y)p(y) >. 
If p oc |x| _e , then £ oc |x| -7 with 7 = 2e — 3. This gives £ oc (1/x) for e = 2. 
Thus, if isothermal spheres are the generic contributors, then we expect the 
correlation function to vary as (1/x) and nonlinear scales, steepening to (1/x 2 ) 
at intermediate scales. Further, since isothermal spheres are local maxima of 
entropy, a configuration like this should remain undistorted for a long duration. 
This argument suggests that a £ which goes as (1/x) at small scales and (1/x 2 ) 
at intermediate scales is likely to grow approximately as a 2 at all scales. The 
form of U[£l] must be consistent with this feature. 

This criterion allows us to determine the form of U in the quasi-linear and non 
linear phases. In the quasi-linear regime, we want a linear correlation function of 
the form £t(a, I) = a 2 /I 2 to be mapped to £~NL(a,x) oc a 2 /x 2 . We thus demand 
U[a 2 l~ 2 ] oc a 2 x~ 2 with I ps x^if- It is trivial to see that this requires U(z) oc z 3 . 
Similarly, in the non linear end, we expect a linear correlation function of the 
form £z,(a, I) = a 2 /I to be mapped to £nl(», x) oc a 2 /x requiring [7[a 2 Z _1 ] oc 
a 2 x~ 1 with I PS x$*. This gives U(z) oc z 3 ! 2 . 

Combining all the results we find that the nonlinear mean correlation function 
can be expressed in terms of the linear mean correlation function by the relation: 



The numerical coefficients have been determined by continuity arguments. We 
have assumed the linear result to be valid up to £ = 1 and the virialisation 
to occur at £ pa 200 which is result arising from the spherical model. The true 
test of such a model, of course, is N-body simulations and remarkably enough, 
simulations are very well represented by relations of the above form. [The fact 
that numerical simulations show a correlation between £(a,x) and £i(a, I) was 
originally pointed out in j27j. These authors, however, gave a multi parameter fit 
to the data which has the virtue of representing the numerical work accurately. 
Equation (|94|), on the other hand, portrays the clear physical interpretation 
behind the result.] The exact values of the numerical coefficients can be obtained 
from simulations and it changes 14.14 to 11.7 and 200 to 125 in the above 
relations. 

The result we have obtained for the non linear end corresponds to an assump- 
tion called stable clustering . If we ignore the effect of mergers, then it seems 
reasonable that virialized systems should maintain their densities and sizes in 
proper coordinates, i.e. the clustering should be "stable" . This would require 
the correlation function to have the form £atl(o,x) = a 3 F(ax). [The factor a 3 
arising from the decrease in background density] . One can easily show that this 

3 /2 

will lead to U oc £ L in the non linear regime. Another way deriving this result 
is to note that if the proper size of the objects do not change with time, then 




3/2 



(for ^ < 1, £ < 1) 

(for 1< £ L < 5.85, 1< £ < 200) 

(for 5.85 < £ L , 200 < £) 



(94) 
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r = which implies, statistically, v = —ax requiring h = 1. Integrating ( p3| ) 

3/2 

with appropriate boundary condition leads to U cx £ L ' . 

In case mergers of structures are important, one would consider this assump- 
tion to be suspect We can, however, generalize the above argument in the 
following manner: If the virialized systems have reached stationarity in the sta- 
tistical sense, then it seems reasonable to assume that the function h — which 
is the ratio between two velocities — should reach some constant value. In that 
case, one can integrate (|9^ ) and obtain the result £jvl = a 3h F(a h x) where h 
now denotes the asymptotic value. A similar argument will now show that 

Z NL (a,x) cx Ma,l)] 3h/2 (95) 

in the general case. If the linear power spectrum has an index n (with Pl(&) cx 
k n ,£,L oc x~(™ +3 - ) ) then one would get 

«„,*) K .«^»-i , = ^0^ (%) 

Simulations are not accurate enough to fix the value of h; in particular, the 
asymptotic value of h could depend on n within the accuracy of the simulations. 

It is possible to obtain similar relations between £(a, x) and £z,(a, I) in two di- 
mensions as well by repeating the above analysis J3^| . In 2-D the scaling relations 
turn out to be 

!( L (a, I) (Linear) 
^ L (a,l) 2 (Quasi-linear) (97) 
l L {a,l) h (Nonlinear) 

where h again denotes the asymptotic value. For power law spectrum the nonlin- 
ear correction function will £,Nh{a, x) — a 2 ~~ l x~~ t with 7 = 2(n + 2)/(ri + 4). If we 
generalize the concept of stable clustering to mean constancy of h in the nonlin- 
ear epoch, then the correlation function will behave as ^nl(ci,x) — a 2h F(a h x). 
In this case, if the spectrum is a power law then the nonlinear and linear indices 
are related to 

2h(n + 2) 

7 2 + h(n + 2) l98j 
[In general, £ cx £f in the quasi-linear regime and £ oc £^ h ^ 2 in the nonlinear 



regime 1 33 where D is the dimension of space] . All the features discussed in the 
case of 3 dimensions are present here as well. For example, if the asymptotic 
value of h scales with n such that h(n + 2) = constant then the nonlinear index 
will be independent of the linear index. Figure ^ shows the results of numerical 
simulation in 2D, which suggests that h = 3/4 asymptotically |34| 

The ideas presented here can be generalized in two obvious directions |pq| : 
(i) By considering peaks of different heights, drawn from an initial Gaussian 
random field, and averaging over the probability distribution one can obtain a 
more precise NSR. (ii) By using a generalized ansatz for higher order correlation 
functions, one can attempt to compute different statistical parameters in the 
quasi linear and nonlinear regimes. 
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Fig. 5. The NSR in 2D giving f (x) as a function of the linear mean correlation function 
The theoretical predictions at the three regimes are shown by solid lines of slopes 
1, 2, 3/4. The data points are from simulations for different power spectra and different 
epochs. The overlapping of data points shows the existence of an NSR and its agreement 
with the theoretical prediction lends support to the model discussed in the text. The 
error bars indicate the typical accuracy of the result 



The entire approach side steps three important issues which needs to be 
investigated more thoroughly than has been done in the literature. 

• There has been no "first-principle" derivation of the non linear scaling re- 
lations in spite of several attempts by different groups. (The NSR does not 
depend on G, for example!) For example, I have not obtained the NSR from 
the basic equation (f53| ) derived earlier in section || because I (or anyone else) 
do not know how to go about doing this. 

• A strong constraint any correlation function must satisfy is that its Fourier 
transform must be positive definite. The necessary and sufficient condition 
for a function f(x) to have a positive definite Fourier transform is that it 
must be a convolution, viz., it must be expressible as an integral over y of a 



Statistical mechanics of gravitating systems 33 

product g(x + y)g(y) where g is some other function. In the NSR, one starts 
with a linear correlation function and maps it by a functional transform 
to obtain a non linear correlation function. There is no guarantee that this 
functional transform will actually lead to a valid correlation function in the 
sense that it will have a positive definite Fourier transform. In fact, it will 
not except approximately; it is not clear what are the implications of this 
result. 

• Both £ and £ can be negative for a range of scale. The NSR described above 
is not applicable to power spectra with negative correlation function. It is 
possible to generalize the formalism to cover this situation but such an ex- 
tension leads to results which are fairly counter intuitive ]30| . 



7 Critical indices and power transfer in gravitational 
clustering 

Given a model for the evolution of the power spectra in the quasi linear and 
nonlinear regimes, one could explore whether evolution of gravitational clustering 
possesses any universal characteristics. The derivation of NSR in the previous 
section has encoded in it the feature that n = — 1 and n = — 2 (in the power 
spectrum with n defined through the relation P cx k n ) will appear as "critical 
indices" in the quasi-linear and non linear regimes respectively, in the sense that 
£ oc a 2 for these indices (which matches with the linear evolution, £i oc a 2 ). 

This suggests even a stronger result: any generic initial power spectrum will 
be driven to a particular form of power spectrum in the late stages of the evo- 
lution. Such a possibility arises because of the following reason: We see from 
equation (|^|) that [in the quasi linear regime] spectra with n < — 1 grow faster 
than a 2 while spectra with n > — 1 grow slower than a 2 . This feature could drive 
the spectral index to n — n c ps —1 in the quasi linear regime irrespective of 
the initial index. Similarly, the index in the nonlinear regime could be driven to 
n ~ —2 during the late time evolution. So the spectral indices —1 and —2 are 
some kind of "fixed points" in the quasi linear and nonlinear regimes. 

This effect can be understood better by studying the "effective" index for 
the power spectra at different stages of the evolution [ p5| . Let us define a local 
index for rate of clustering by 

. . <91n£(a,x) , . 

n a a,x)= „y ' 99 
din a 

which measures how fast £(a,x) is growing. When £(a,x) <C 1, then n a = 2 
irrespective of the spatial variation of £(a,x) and the evolution preserves the 
shape of £(a,x). However, as clustering develops, the growth rate will depend 
on the spatial variation of £(a, x). Defining the effective spatial slope by 



(100) 
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one can rewrite the equation as 



n a =h( 77 ^—-n x ) (101) 

At any given scale of nonlinearity, decided by £(a, x), there exists a critical spatial 
slope n c such that n a > 2 for n x < n c [implying rate of growth is faster than 
predicted by linear theory] and n a < 2 for n x > n c [with the rate of growth 
being slower than predicted by linear theory]. The critical index n c is fixed by 



setting n a = 2 in equation (101) at any instant. This requirement is established 
from the physically motivated desire to have a form of the two point correlation 
function that remains invariant under time evolution. Since the linear end of 
the two point correlation function scales as a 2 , the required invariance of form 
constrains the index n a to be 2 at all scales . The fact that n a > 2 for n x < n c and 
n a < 2 for n x > n c will tend to "straighten out" correlation functions towards 
the critical slope. [We are assuming that £(a,x) has a slope that is decreasing 
with scale, which is true for any physically interesting case]. From the NSR it is 
easy to see that in the range 1 ^ £ Si 200, the critical index is n c ss — 1 and 
for 200 £, the critical index is n c « —2. This clearly suggests that the local 
effect of evolution is to drive the correlation function to have a shape with (1/x) 
behaviour at nonlinear regime and (1/x 2 ) in the intermediate regime. Such a 
correlation function will have n a ~ 2 and hence will grow at a rate close to a 2 . 

To go from the scalings in two limits to an actual profile, we can use some 
fitting function. One possible interpolation f3(| between the two limits is given 
by: 

with L, B being constants. If we evolve this profile (with the optimum value of 
B = 38.6) from a 2 = 1 to a 2 « 1000 using the NSR, and plot [£(a, x)/a 2 ] against 
x, then the curves virtually fall on top of each other within about 10 per cent 
[ ^C| . This shows that the profile does grow approximately as a 2 . 

These considerations also allow us to predict the nature of power transfer in 
gravitational clustering. Suppose that, initially, the power spectrum was sharply 
peaked at some scale fco = 2tt/Lq and has a small width Ak. When the peak 
amplitude of the spectrum is far less than unity, the evolution will be described 
by linear theory and there will be no flow of power to other scales. But once 
the peak approaches a value close to unity, power will be generated at other 
scales due to nonlinear couplings even though the amplitude of perturbations in 
these scales are less than unity. For modes with no initial power [or exponentially 
small power], nonlinear coupling provides the only driving term in fl48[) . These 
generate power at the scale k through mode-coupling, provided power exists at 



some other scale. [We saw this effect in section 5.5 for a simple model.] 

In x— space, this arises along the following lines: If the initial spectrum is 
sharply peaked at some scale L , first structures to form are voids with a typical 
diameter Lq. Formation and fragmentation of sheets bounding the voids lead to 
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generation of power at scales L < Lq. First bound structures will then form at 
the mass scale corresponding to Lq. In such a model, £u n at L < Lq is nearly 
constant with an effective index of n ~ —3. Assuming we can use equation ( pi} ) 
with the local index in this case, we expect the power to grow very rapidly as 
compared to the linear rate of a 2 . [The rate of growth is a 6 for n = —3 and a 4 
for n = —2.5.] Different rate of growth for regions with different local index will 
lead to steepening of the power spectrum and an eventual slowing down of the 
rate of growth. In this process, which is the dominant one, the power transfer is 
mostly from large scales to small scales. [Except for the gene ratio n of the k A tail 
at large scales which we have discussed earlier in subsection 5.4.] 
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Fig. 6. The transfer of power in gravitational clustering 



From our previous discussion, we would have expected such an evolution 
to lead to a "universal" power spectrum with some critical index n c rj — 1 
for which the rate of growth is that of linear theory - viz., a 2 . In fact, the 
same results should hold even when there exists small scale power; numerical 
simulations confirm this prediction and show that - in the quasi linear regime, 
with 1 < S < 100 - power spectrum indeed has a universal slope (see figures ^ 
^; for more details, see p5|]). 

The initial power spectrum for figure |(] was a Gaussian peaked at the scale 
fco = ^/Lq;Lq = 24 and having a spread Ak = 27r/128. The amplitude of 
the peak was chosen so that Au n (ko = 27r/Lo,a = 0.25) = 1, where A 2 (k) = 
k 3 P(k) /(27r 2 ) and P(k) is the power spectrum. The y-axis is A(k)/a, the power 
per logarithmic scale divided by the linear growth factor. This is plotted as a 
function of scale L — 2ir/k for different values of scale factor a(t) and the curves 
are labeled by the value of a. As we have divided the power spectrum by its linear 
rate of growth, the change of shape of the spectrum occurs strictly because of 
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Fig. 7. The growth of gravitational clustering towards a universal power spectrum 
P(k) oc AT 1 

non-linear mode coupling. It is clear from this figure that power at small scales 
grows rapidly and saturates at the growth rate close to the linear rate [shown by 
crowding of curves] at later epochs. The effective index for the power spectrum 
approaches n = — 1 within the accuracy of the simulations. This figure clearly 
demonstrates the general features we expected from our understanding of scaling 
relations. 

Figure ^ compares power spectra of three different models at a late epoch. 
Model I was described in the last para; Model II had initial power concentrated 
in two narrow windows in /c-space. In addition to power around Lq = 24 as in 
model I, we added power at k\ = 2n/Li;Li = 8 using a Gaussian with same 
width as that used in model I. Amplitude at L\ was chosen five times higher than 
that at Lq — 24, thus Au n (ki,a = 0.05) = I. Model III was similar to model 
II, with the small scale peak shifted to k\ = 2n/Li;Li = 12. The amplitude of 
the small scale peak was the same as in Model II. At this epoch Au n (ko) = 4.5 
and it is clear from this figure that the power spectra of these models are very 
similar to one another. 

8 Universal behaviour of gravitational clustering in the 
asymptotic limit 

In the study of isolated gravitating systems, it is easy to arrive at a broad picture 
of the late time structure of the system. We have seen that it will be made of 
a very compact core and a diffuse halo and — if the system is confined by a 
reflecting wall — then this state could be thought of as being made of two 
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distinct phases as described in section |^. By and large, all memory of initial 
stage could have been wiped out in the asymptotic steady state. 

The situation is much more complicated in the case of gravitational clustering 
in an expanding background. To begin with, defining an asymptotic state in a 
universe which is expanding according to a(t) cx t 2 / 3 itself is problematic. Taking 
a strict t —> oo limit is meaningless and hence one is interested in time-scales 
which are finite but much longer than other time scales in the problem. We 
need to make this notion more precise. Secondly, we saw in section 5.1 that the 
virialized cores and the diffuse halo are only weakly coupled in the cosmological 
context. Hence the memory of initial conditions cannot be wiped out as easily in 
this case as, for example, in the context of systems with short range interaction 
or even in the context of gravitating systems in static backgrounds. We will now 
discuss some of these issues. 

Let us start with a time t = ti at which the density fluctuations 8k is much 
smaller than unity at all scales. (If necessary, we shall assume that there are 
cut-offs at small and large scales.) The evolution in the initial stages can then 
be understood by linear theory developed in section |5.2| ; the power at all scales 
grow as a 2 and the power spectrum maintains its shape. As time goes on, the 
density contrast at some scale will hit unity and structures corresponding to that 
characteristic scale will begin to get formed. If the power spectrum decreases with 
fc" 1 (so that there is more power at small spatial scales), then small scales will 
go non linear first. 

Once the non linear structure has formed, we need to study the evolution 
at different wave numb ers differently. If k n \ is the scale at which S w 1, then 
our analysis in section 5T shows that the evolution is still linear at k <C k n \ 
provided the initial power was more than k 4 ; if not, a k 4 tail develops rapidly at 
these large spatial scales. At k 3> fc n i, virialized structures would have formed, 
which are fairly immune to overall expansion of the universe. The discussion in 
section 5.1 shows that these scales also do not affect the evolution at smaller k. 
In this sense, the universe decouples nicely into a clustered component and an 
unclustered one which do not affect each other to the lowest order. The situation 
is most complex for k « fc n i since neither of the approximations are possible. 

The critical scale k n i itself is a function of time and, in the popular cosmolog- 
ical models, keeps decreasing with time; that is, larger and larger spatial scales 
go non linear as time goes on. In such an intrinsically time dependent situation, 
one could ask for different kinds of universal behaviour and it is important to 
distinguish between them. 

To begin with, one could examine whether the power spectrum (or the corre- 
lation function) has a universal shape and evolution at late times, independent 
of initial power spectrum. We have seen in section [7] that the late time power 
spectrum does have a universal behaviour if the initial spectrum was sharply 
peaked. In this case, at length scales smaller than the initial scale at which the 
power is injected, one obtains two critical indices n = — 1 and —2 and the two 
point correlation function has an approximate shape of (102) or even simpler, 
£,(a,x) cx a 2 a;^ 1 (/ + x)~ 4 where I is the length scale at which £ w 200. At scales 
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bigger than the scale at which power was originally injected, the spectrum de- 
velops a k 4 tail. 

But if the initial spectrum is not sharply peaked, each band of power evolves 
by this rule and the final result is a lot messier. The NSR developed in section 
U allows one to tackle this situation and ask how the non linear scales will 
behave. Here one of the most popular assumptions used in the literature is that 
of stable clustering which requires v = — ax (or h = 1) for sufficiently large 
£(a,x). Integrating equation ( p3|) with h = 1, we get £(a,z) = £^ 2 (a,x). If 
Pi n (k) tx k n so that £i(a,a;) oc a 2 x~( n+3 \ then £(a,x) at nonlinear scales will 
vary as 

^(a,x) (xa^x'^Sri 1 ; (f > 200) (103) 

if stable clustering is true. Clearly, the power law index in the nonlinear regime 
"remembers" the initial index. The same result holds for more general initial 
conditions. If stable clustering is valid, then the late time behaviour of £(a, x) 
is strongly dependent on the initial conditions. The two (apparently reasonable) 
requirements: (i) validity of stable clustering at highly nonlinear scales and (ii) 
the independence of late time behaviour from initial conditions, are mutually 
exclusive. 

This is yet another peculiarity of gravity in the context of expanding back- 
ground. The initial conditions are not forgotten — unlike in systems with short 
range interactions or even in the context of gravitating systems in static back- 
ground. The physical reason for this feature is the weak coupling between the 



virialized "cores" and diffuse "halos", described in section 5.1. This weakness 
of the coupling keeps the memory of the initial conditions alive as the system 
evolves. 

While this conclusion is most probably correct, the assumption of stable 
clustering may not be true. In fact, there is some evidence from simulations that 
this assumption is not true pl| . In that context, another natural assumption 
which could replace stable clustering, will be the following: We assume that h 
reaches a constant value asymptotically which is not necessarily unity. Then we 
get £,(a,x) — a 3h F[a h x] where h now denotes the constant asymptotic value of 
of the function. For an initial spectrum which is scale-free power law with index 
n, this result translates to 

• (104) 

where 7 is given by 

Jhin + 3) 

' 2 + h(n + 3) y ' 

One can obtain a 7 which is independent of initial power law index provided h 
satisfies the condition h[n + 3) = c, a constant. Unfortunately, simulations are 
not good enough yet to test this conclusion in the really asymptotic domain. 

The second question one could ask, concerns the density profiles of individual 
virialized halos which, of course, is related to the behaviour of the correlation 
function |3q|, [B7j, [B8|. To focus on this relation, let us start with the simple 
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assumption that the density field p(a, x) at late stages can be expressed as a 
superposition of several halos, each with some density profile; that is, we take 

P(a,x) = ^2f(x~*i,a) (106) 

where the i-th halo is centered at Xj and contributes an amount /(x — x,-, a) at 
the location x, [We can easily generalize this equation to the situation in which 
there are halos with different properties, like core radius, mass etc by summing 
over the number density of objects with particular properties; we shall not bother 
to do this. At the other extreme, the exact description merely corresponds to 
taking the f's to be Dirac delta functions. Hence there is no loss of generality 



in (106)]. The power spectrum for the density contrast, 5(a,x) = (p/pt — 1), 



corresponding to the p(a, x) in (106) can be expressed as 



P(k,a) = ( a 3 |/(k,a)|) 2 



exp — zk • Xj(a) 



= (a 3 |/(k,a)|) 2 P cont (k,a) 



(107) 

where P cent (k, a) denotes the power spectrum of the distribution of centers of 



the halos. This equation (107) can be used directly to probe the nature of halo 
profiles along the following lines. If the correlation function varies as | oc x -7 , 
the correlation function of the centers vary as £ cx x~ lc and the individual 
profiles are of the form f(x) oc x~ e , then the relation P(k) = \f{k)\ 2 P cen t{k) 
translates to 

e = 3+i( 7 -7c) (108) 

At very non linear scales, the centers of the virialized clusters will coincide with 
the deep minima of the gravitational potential. Hence the power spectrum of the 
centers will be proportional to the power spectrum of the gravitational potential 
P<f>(k) oc fc"^ 4 if P(k) oc k n . Since the correlation functions vary as x~^ a+3 ^ when 



the power spectra vary as k a , it follows that 7 — — 4. Substituting into (108) 
we find that e = 1 at the extreme non linear scales. On the other hand, in the 
quasi-linear regime, reasonably large density regions will act as cluster centers 
and hence one would expect P ce nt(fc) and P(k) to scale in a similar fashion. This 
leads to 7 ~ 7c, giving e « 3. So we would expect the halo profile to vary as 
x~ x at small scales steepening to x -3 at large scales. A simple interpolation for 
such a density profile will be 

^ K (109) 

Such a profile, usually called NFW profile Pq| , is often used in cosmology though 
I have not come across the simple theoretical argument given above in the liter- 
ature. Unfortunately, it is not possible to get this result from more detailed and 
transparent arguments. 

In general, at very nonlinear scales, the correlation function probes the profile 



of individual halos and P cen t ~ constant implying 7c = 3 in equation (108). At 
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these scales, we get 

7 = 2e-3 (110) 



For the situation considered in equation ( 105 ), with h(n + 3) = c, the halo profile 
will have the index 

'c+r 



' = 3 {—2> 

which is independent of initial conditions. Note that we are now demanding the 
asymptotic value of h to explicitly depend on the initial conditions though the 
spatial dependence of £(a, x) does not. In other words, the velocity distribution — 
which is related to h — still "remembers" the initial conditions. This is indirectly 
reflected in the fact that the growth of £(a, x) — represented by a 6c /^ 2+c ^ n+3 ^ 
— does depend on the index n. 

As an example of the power of such a — seemingly simple — analysis note the 
following: Since c > 0, it follows that e > (3/2); invariant profiles with shallower 
indices (for e.g with e = 1) discussed above are not consistent with the evolution 
described above. 

While the above arguments are suggestive, they are far from conclusive. It 
is, however, clear from the above analysis and it is not easy to provide unique 
theoretical reasoning regarding the shapes of the halos. The situation gets more 
complicated if we include the fact that all halos will not all have the same mass, 
core radius etc and we have to modify our equations by integrating over the 
abundance of halos with a given value of mass, core radius etc. This brings 
in more ambiguities and depending on the assumptions we make for each of 
these components and the final results have no real significance. The issue is 
theoretically wide open. 
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Appendix A: Basic Cosmology 

I summarize in this appendix some of the key equations in cosmology which 
are used in the review. More detailed discussions can be found in standard text 
books §. 

The cosmological model which we use in our discussion is described by a 
spacetime interval of the form 

ds 2 = dt 2 - a 2 (t)dx 2 (112) 

where a(t) is called the expansion factor which describes the rate of expansion 
of the universe and the units are chosen so that c = 1. The form of a(t) is 
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determined by the energy density present in the universe and is determined 
through the equations 

la? - - (^-pbcA = 0; d{p b a 3 ) = ~pda 3 (113) 
2 a \ 3 / 

where pb is the energy density of matter and p is the pressure related to pb by 
an equation of state of the form p = p(p b ). This equation of state, along with 



the second equation in (113) determines p b as a function of a. Substituting in 
the first equation in (113), one can determine a(t). As a mnemonic (and only 
as a mnemonic) , one can think of the first equation as giving the sum of kinetic 
energy and potential energy of the universe to be zero for the universe and the 
second equation as an equivalent of dE = —pdV. There are also equivalent to 
the relation (a/a) = — (4%G/3)pb(t) 

Non relativistic matter moving at speeds far less than speed of light (v <C 
c) will have the pressure p pbv 2 negligibly small compared to the energy 
density pbC 2 . We can then set p w obtaining pb oc a~ 3 and a oc t 2 / 3 . Since an 
overall multiplication constant in a can be absorbed by rescaling the lengths, the 
normalization of a(t) is arbitrary. It is conventional to take a = (t/to) 2 ^ 3 with 
to denoting the current age of the universe. The rate of expansion at present is 
called the Hubble constant and is defined by 

tfo (H4) 

Observationally, Hq « 75 km s _1 Mpc -1 . This defines a natural time scale 
tii = Hq 1 w 10 10 years and a length scale d H = cHq 1 w 4000 Mpc. Note that 
H and the energy density are related by po = (3-ffp / 8ttG) . 

The general relativistic effects of gravity are felt over length and time scales 
comparable to tg,dn. Since the typical separation between galaxies is about 1 
Mpc and the size of large superclusters in the universe is about 100 Mpc, the 
general relativistic effects are not important at the present epoch for most of 



the issues in structure formation. Equation ( 112 ) then suggests that if we use 
the coordinate r = a(f)x, the physical laws in an expanding universe will take 
more familiar form at small scales. (The r is called the proper coordinate, while 
x is called the comoving coordinate.) For example, gravity can be described by 
Newtonian theory in proper coordinates with the cosmic background providing 
an extra gravitational potential <?frw = ~{2nG/3)pb(t)r 2 corresponding to a 
uniform distribution of matter. 

When matter expands with the universe homogeneously, the proper coordi- 
nate separation between any two points vary as r = dx = (d/a)r. (This is called 
Hubble expansion with v = Hr.) In this case, 

f = --r = ^-p b (t)r = -V r <?FRW (H5) 
a 3 

Gravitational clustering and growth of inhomogeneities require particles to move 
relative to the cosmic expansion with x ^ 0. 
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In the study of structure formation, the central quantity one uses is the 
density contrast defined as S(t,x) — [p(t,x) — pb(t)]/pb(t) which characterizes 
the fractional change in the energy density compared to the background. (Here 
Pb(t) —< p(t,x.) > is the mean background density.) Since one is often inter- 
ested in the statistical behaviour of structures in the universe, it is conventional 
to assume that S and other related quantities are elements of an ensemble. Many 
popular models of structure formation suggest that the initial density pertur- 
bations in the early universe can be represented as a Gaussian random variable 
with zero mean (that is, < 5 >= 0) and a given initial power spectrum. The 
latter quantity is defined through the relation P(t,k) =< |<5fc(i)| 2 > where 5k is 
the Fourier transform of <5(i,x). It is also conventional to define the two-point 
correlation function £(i, x) and the mean correlation function £,(t,x) via the 
equations 

£(t, x) = j- r P(t, k) e* kx ; f (i, x) = -j / ^(t, V) y 2 dy (116) 
J (2tt) x 1 J 

Though gravitational clustering will make the density contrast non Gaussian at 
late times, the power spectrum and the correlation functions continue to be of 
primary importance in the study of structure formation. 
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